Hands-on_Exercise 3.2 - Programming Animated Statistical Graphics with R

Author

Nguyen Nguyen Ha (Summer)

Published

April 30, 2025

Modified

May 1, 2025

4.1. Overview

When telling a visually driven data story, animated graphics tend to draw more attention and leave a stronger impression than static visuals. In this hands-on section, you’ll learn how to build animated data visualizations using the gganimate and plotly R packages.

You’ll also apply: - tidyr for reshaping and organizing data
- dplyr for data wrangling, filtering, and transformation

Why animation matters

Motion helps highlight patterns over time, build engagement, and illustrate dynamic trends that static plots can’t show effectively.

4.1.1. Basic Concepts of Animation

Animation in R doesn’t make the plot itself move. Instead, it creates many static frames (individual plots), each representing a snapshot in time or state, and stitches them together to simulate motion—just like a flipbook or cartoon.

Each frame is: - A subset of the dataset - A new version of the same plot (but with different values) - Rendered in sequence to create flow

This structure is perfect for showing time series changes, category transitions, or movement through space.

Animation Frame Concept
Note

What drives the animation is a variable used to split the dataset—commonly time (e.g., year, month) or a sequence of states.

Tip

Use transition_*() functions in gganimate like transition_time() or transition_states() to define how the animation progresses between frames.

4.1.2. Terminology

Before diving into the steps for creating animated statistical graphs, it’s essential to understand a few core concepts and terms that define how animation works in data visualization.

Key Terms

  1. Frame
    In an animated plot, each frame represents either a point in time (e.g., a year) or a specific category (e.g., region or class). When the frame changes, the plot updates to reflect new values for that subset of data.

  2. Animation Attributes
    These are settings that control animation behavior, such as:

    • Duration of each frame
    • Easing function for transitions (e.g., linear, cubic-in-out)
    • Whether the animation loops or starts from the current frame
Tip

Before creating animated graphs, ask:
➡️ Does this animation add real value, or is it just visual noise?

For exploratory analysis, static plots might be more efficient.
But for presentations or storytelling, a well-timed animation can boost clarity, engagement, and emotional connection—especially when explaining changes over time.

4.2. Getting Started

4.2.1. Loading the R Packages

Before creating animated visualizations, ensure the following R packages are installed and loaded:

  • plotly: For building interactive statistical graphics.
  • gganimate: A ggplot2 extension for animated plots.
  • gifski: Converts video frames into animated GIFs using high-quality color handling and dithering.
  • gapminder: Contains excerpted data from Gapminder.org, especially useful for country-based animations.
  • tidyverse: A collection of packages for data wrangling, visualization, and transformation.

Use the following code to install and load them efficiently:

pacman::p_load(
  readxl, gifski, gapminder,
  plotly, gganimate, tidyverse
)
Tip

The pacman::p_load() function automatically checks if a package is installed—if not, it installs it before loading. It’s a smart and concise way to manage dependencies.

4.2.2 Importing the Data

This section demonstrates how to import the Data worksheet from the GlobalPopulation.xls Excel workbook and prepare it for animation.

The objective includes: - Reading the Excel file - Converting the Country and Continent columns to factors - Converting the Year column to integer format

Step 1: Initial Version Using mutate_each() (Deprecated)

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet = "Data") %>%
  mutate_each(funs(factor(.)), col) %>%
  mutate(Year = as.integer(Year))
Things to Learn
  • read_xls() from the readxl package is used to import Excel worksheets.
  • mutate_each() combined with funs() (both deprecated) applies transformations across selected columns.
  • mutate() with as.integer() updates the Year column to numeric format.

Step 2: Updated Version Using mutate_at() To align with newer versions of dplyr, mutate_each() is replaced with mutate_at().

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet = "Data") %>%
  mutate_at(col, as.factor) %>%
  mutate(Year = as.integer(Year))
Deprecated!

mutate_at() remains functional but is soft-deprecated in favor of the more consistent across() syntax.

Step 3: Final (Recommended) Version Using across() The most up-to-date and robust syntax utilizesmutate() with across().

col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate(across(col, as.factor)) %>%
  mutate(Year = as.integer(Year))
Tip

The across() function is the recommended approach for applying transformations to multiple columns. It ensures compatibility with current and future versions of the tidyverse.

4.3. Animated Data Visualisation: gganimate Methods

The gganimate package extends the grammar of graphics as implemented by ggplot2 to include animation functionality. This is accomplished by introducing new grammar components that control how plots change over time.

Key functions provided by gganimate include:

  • transition_*() — defines how data should spread out and relate to itself over time.
  • view_*() — controls how positional scales change throughout the animation.
  • shadow_*() — displays data from other points in time to provide visual context.
  • enter_*() / exit_*() — manages how new data appears or disappears during the animation.
  • ease_aes() — determines how aesthetics should transition smoothly from one frame to another.

4.3.1. Building a Static Population Bubble Plot

The following code chunk creates a static bubble plot using ggplot2, which forms the foundation for animation:

ggplot(globalPop, aes(x = Old, y = Young,
                      size = Population,
                      colour = Country)) +
  geom_point(alpha = 0.7,
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}',
       x = '% Aged',
       y = '% Young')

4.3.2. Building the Animated Bubble Plot

This section adds animation to the previously constructed bubble plot using gganimate.

  • transition_time() from gganimate is used to create transitions across distinct time states (e.g., Year).
  • ease_aes() controls the easing of aesthetics during transitions. The default is 'linear', with alternatives including quadratic, cubic, quartic, quintic, sine, circular, exponential, elastic, back, and bounce.

The following code generates the animated bubble chart:

ggplot(globalPop, aes(x = Old, y = Young,
                      size = Population,
                      colour = Country)) +
  geom_point(alpha = 0.7,
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}',
       x = '% Aged',
       y = '% Young') +
  transition_time(Year) +
  ease_aes('linear')

This chart animates population bubbles over time, with smooth transitions between years using linear easing. The {frame_time} placeholder dynamically updates the plot title for each frame in the animation.

4.4. Animated Data Visualisation: plotly

The plotly R package supports frame-by-frame animations using the frame aesthetic. Both ggplotly() and plot_ly() functions enable key frame transitions and allow specification of the ids aesthetic to preserve object identity across frames.

4.4.1. Building an Animated Bubble Plot: ggplotly() Method

This section demonstrates how to animate a bubble chart using ggplotly(). The animation is driven by the frame aesthetic, which enables transitions between yearly data points.

The animated bubble plot above includes a play/pause button and a slider component for controlling the animation

gg <- ggplot(globalPop,
             aes(x = Old,
                 y = Young,
                 size = Population,
                 colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7,
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged',
       y = '% Young')

ggplotly(gg)
Things to learn from the code chunk above
  • Appropriate ggplot2 functions are used to create a static bubble plot. The output is then saved as an R object called gg.
  • ggplotly() is then used to convert the R graphic object into an animated svg object.

Notice that although show.legend = FALSE argument was used, the legend still appears on the plot. To overcome this problem,theme(legend.position='none') should be used as shown in the plot and code chunk below.

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young') + 
  theme(legend.position='none')

ggplotly(gg)

4.4.2. Building an animated bubble plot: plot_ly() method

In this sub-section, you will learn how to create an animated bubble plot by using plot_ly() method.

bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent,
          sizes = c(2, 100),
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          ) %>%
  layout(showlegend = FALSE)
bp

Practice: Age Distribution Over Time Using ggplot2 and gganimate

a. Using ggplot2

ggplot(globalPop, aes(x = Old, fill = factor(Year))) +
  geom_density(alpha = 0.5) +
  labs(title = "Distribution of Aged Population Over Time",
       x = "% Aged",
       y = "Density") +
  theme_minimal()

Purpose

This non-animated plot provides quantitative insight about shifting population structure. It complements the animated bubble plot by showing statistical distribution rather than motion.

b. Using gganimate

ggplot(globalPop, aes(x = Old, fill = factor(Year))) +
  geom_density(alpha = 0.6, colour = "white") +
  labs(title = "Year: {frame_time}",
       subtitle = "Distribution of % Aged Population",
       x = "% Aged",
       y = "Density") +
  theme_minimal() +
  transition_time(Year) +
  ease_aes('cubic-in-out')

Animated Bonus Insight

This animated density plot highlights demographic shifts in aging. The rightward movement of the peak density over time suggests an increase in elderly population — a key demographic trend that complements the animated bubble plots.

4.5. Reference